Estimation of genetic variation in yield, its contributing characters and capsaicin content of Capsicum chinense Jacq. (ghost pepper) germplasm from Northeast India

Capsicum chinense Jacq. (ghost pepper), a naturally occurring chili species of Northeast India, is known throughout the world for its high pungency and pleasant aroma. Its economic importance is due to its high capsaicinoid levels, which make it a main source for the pharmaceutical industry. The present study focused on identifying the traits important for increasing the yield and pungency of ghost pepper and on determining the parameters for selecting superior genotypes. A total of 120 genotypes with more than 1.2% capsaicin content (>1,92,000 Scoville Heat Units, w/w on dry weight basis), collected from different regions of Northeast India, were subjected to variability, divergence and correlation studies. Levene's test of homogeneity of variance across the three environments did not show significant deviation, so the homogeneity of variance assumption was reasonably met for analysis of variance. Genotypic and phenotypic coefficients of variation were highest for fruit yield per plant (33.702 and 36.200, respectively), followed by number of fruits per plant (29.583 and 33.014, respectively) and capsaicin content (25.283 and 26.362, respectively). In the correlation study, number of fruits per plant made the maximum direct contribution to fruit yield per plant, and fruit yield per plant made the maximum direct contribution to capsaicin content. High heritability with high genetic advance, the most favored selection criterion, was observed for fruit yield per plant, number of fruits per plant, capsaicin content, fruit length and fruit girth. The genetic divergence study partitioned the genotypes into 20 clusters, with fruit yield per plant contributing most to total divergence. Principal component analysis (PCA), performed to determine the largest contributors of variation, explained 73.48% of the total variability, of which PC1 and PC2 contributed 34.59% and 16.81%, respectively.


INTRODUCTION
The northeast region of India is home to a rich diversity of domesticated Capsicum species, among which C. chinense Jacq. (ghost pepper, bhut jolokia or Naga King chili) is immensely popular for its unique aroma and high pungency (Sarwa et al., 2013; Baruah et al., 2019). The region, with its unique ecological conditions and high humidity, acts as a centre of speciation and has given rise to the hottest pepper in the world (Guinness Book of World Records, 2007). The genus Capsicum (family Solanaceae), a New World crop, is represented by thirty-five species (Barchenger, Naresh & Kumar, 2019). It was introduced to India at the end of the 17th century by Portuguese explorers and to Northeast India by Christian missionaries (Basu & De, 2003). Ghost pepper, a semi-perennial species, is a naturally occurring variety of Northeast India (Baruah & Lal, 2020). Besides its use as a flavoring agent, the capsaicin extracted from this plant species has many pharmacological applications (Meghvansi et al., 2010).
Ghost pepper shows wide variation in morphological characteristics (Purkayastha et al., 2012). According to literature reports, ghost pepper has a higher capsaicin concentration than other Indian chili varieties (Baruah et al., 2014; Baruah et al., 2019), making it a potential crop for the extraction of oleoresin capsicum and capsaicin for commercial use (Meghvansi et al., 2010; Baruah & Lal, 2020; Bhandari et al., 2021). Low capsaicin-yielding varieties are not suitable for commercial cultivation because of the high cost of capsaicin extraction (Baruah & Lal, 2020). However, despite these advantages, no measures have been taken to improve this important crop, and production does not meet demand in terms of quality requirements such as high fruit yield and high capsaicin content. This may be due to the lack of superior varieties in the public domain or to the cultivation of lines with low capsaicin content. To overcome this, information on morphological characteristics, evaluation of a large number of germplasm accessions, diversity studies and the creation of a gene bank are prerequisites that allow genetic variability to be explored efficiently (Santos-Pessoa et al., 2018; Karim et al., 2022). Heritability indicates the extent to which a trait is transmissible from parents to offspring and is most useful when considered together with other parameters (Afiah, Mohammad & Saleem, 2000; Munda, Dutta & Lal, 2021; Begum et al., 2022). Path coefficient analysis helps to identify useful traits associated with yield based on the direct and indirect effects of the traits on economically important traits (Seyoum, Alamerew & Bantte, 2012; Pandey et al., 2022). In addition, information on all the genetic parameters helps the breeder to select suitable advanced cultivars within a short period of time (Dutta et al., 2017; Karim et al., 2022).
The variability present in the germplasm is used for effective selection of divergent parents, which in turn helps to obtain hybrids with greater heterotic effect (Correa & Gonçalves, 2012). The degree of genetic variability can be measured using Mahalanobis' D² analysis, a powerful tool for determining the relative contribution of different traits (at both inter- and intra-cluster levels) to total divergence in self-pollinated plants (Hasan et al., 2015; Munda, Dutta & Lal, 2021). Knowledge of morphological as well as genetic diversity is essential for initiating any breeding programme focused on the development of superior varieties. Many reports are available in the public domain on the variability and diversity of different Capsicum species. However, the few reports available on this important plant are based on small populations and cannot be considered reliable, as a small sample size often produces skewed results (Baruah et al., 2019). There is therefore a need for a proper scientific study of genetic variability, heritability, genetic advance and the interrelationships among economically important traits, along with their direct and indirect effects on fruit yield and capsaicin content through path studies. Accordingly, a planned breeding experiment was conducted to identify the selection criteria for developing high-yielding lines with higher capsaicin content in this industrially important crop.

Planting material and experimental design
The experiment was carried out at the CSIR-NEIST (North East Institute of Science and Technology) experimental farm, Jorhat, Assam, India (26°44′ N, 94°9′ E, 94 m a.s.l.). One hundred and twenty (120) genotypes with more than 1.2% (1,92,000 SHU) capsaicin content on dry weight basis were selected from an initial collection of 227 germplasm (Baruah et al., 2019) and planted in a randomized complete block design (RCBD) with three replications during three consecutive years, i.e., kharif 2017, kharif 2018 and kharif 2019. Among them, genotypes RRL-BJ-102 and 18 were brown variants, RRL-BJ-20 and 25 were yellow variants, and RRL-BJ-92 and 58 were round-shaped red variants (Fig. S1). For capsaicin estimation, fully ripe chilies were harvested from each plot and dried immediately to retain their quality, such as intact red colour and pungency. A total of 16 plants of each genotype were planted in a plot of 2.5 × 3 m, with 60 × 60 cm plant-to-plant and line-to-line spacing. A recommended fertilizer dose (NPK) of 120:80:60 kg/ha/year was applied in the experiment. All standard agronomical practices were followed to raise a good crop. Morpho-agronomic characterization was based on the IPGRI (International Plant Genetic Resources Institute, 1995) report on Capsicum species. For all the studied traits, morphological data were collected in triplicate during kharif 2017 and their average was calculated. The same process was followed during kharif 2018 and 2019. The data obtained from the three years were then pooled, and their average value was taken for the final statistical analysis. Meteorological data recorded during the study years are presented in Table S1.
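The selection threshold above pairs 1.2% capsaicin with 1,92,000 SHU. A minimal sketch of that conversion follows, assuming the commonly quoted rating of about 16,000,000 SHU for pure capsaicin (an assumption that reproduces the pairing given in the text but is not stated in it). The function names are illustrative only.

```python
# Assumption: pure capsaicin is rated at about 16,000,000 SHU.
# With this factor, 1.2% capsaicin corresponds to 192,000 SHU,
# matching the threshold quoted in the text.
PURE_CAPSAICIN_SHU = 16_000_000


def capsaicin_percent_to_shu(percent_dry_weight: float) -> float:
    """Convert capsaicin content (% w/w, dry weight basis) to SHU."""
    return percent_dry_weight / 100.0 * PURE_CAPSAICIN_SHU


def passes_selection(percent_dry_weight: float, minimum_percent: float = 1.2) -> bool:
    """Return True when a genotype exceeds the capsaicin threshold (hypothetical helper)."""
    return percent_dry_weight > minimum_percent


if __name__ == "__main__":
    print(round(capsaicin_percent_to_shu(1.2)))
    print(passes_selection(1.35))
```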

Traits studied
Data were recorded for 11 traits, viz., vegetative plant height (cm), number of main branch, leaf length (cm), leaf breadth (cm), fruit length (cm), fruit girth (cm), number of fruits per plant, fruit yield per plant (g), capsaicin content (%), days to 50% flowering and days to maturity, for three consecutive years (Table S2). After harvesting, the mature fruits were oven dried (45 °C for 3-4 days, depending on fruit thickness) for extraction of capsaicin. Capsaicin was estimated using a spectrophotometric method in triplicate (Thimmaiah, 1999), followed by validation using a uHPLC method. Two grams of dried chilli powder was dissolved in 4 mL of ethanol extract and kept in a water bath at 80 °C for 3 h, with manual inversion every hour. The samples were then kept at room temperature for cooling. The supernatant layer of each sample was filtered through a Nylon 33 mm 0.45 µm filter (AxivaSchem. Pvt. Ltd., Sonipat, India). A uHPLC Ultimate 3000 (Thermo Fisher Scientific, Waltham, MA, USA) system equipped with a Betasil C18 column (particle size 3 µm, dimensions 150 × 4.6 mm) was used for analysis, with the column temperature maintained at 60 °C, the sampler temperature at 20 °C and a sample volume of 5 µL. A binary mixture of water-acetonitrile at a 50:50 ratio was used as the mobile phase, and the flow rate was 1.5 mL/min. The procedure for capsaicin estimation described by Bhandari et al. (2021) was used in the study.

Statistical analysis
To confirm the homogeneity of the studied environments, Levene's test (1960) was performed using SPSS software (version 16.03) before pooling the data. The hypotheses are:

$$H_0: \sigma_1^2 = \sigma_2^2 = \dots = \sigma_k^2 \qquad H_a: \sigma_i^2 \neq \sigma_j^2 \ \text{for at least one pair } (i, j)$$

For a variable $Y$ with sample size $N$ divided into $k$ subgroups, Levene's test statistic is

$$W = \frac{N-k}{k-1} \cdot \frac{\sum_{i=1}^{k} N_i \left(\bar{Z}_{i.} - \bar{Z}_{..}\right)^2}{\sum_{i=1}^{k} \sum_{j=1}^{N_i} \left(Z_{ij} - \bar{Z}_{i.}\right)^2}$$

where $N_i$ is the sample size of the $i$th subgroup, $Z_{ij} = |Y_{ij} - \bar{Y}_{i.}|$, $\bar{Y}_{i.}$ is either the mean or the median of the $i$th subgroup, $\bar{Z}_{i.}$ is the mean of the $Z_{ij}$ within subgroup $i$, and $\bar{Z}_{..}$ is the overall mean of the $Z_{ij}$.

Statistical analyses were performed using INDOSTAT software version 8.2. The data were subjected to standard analysis of variance (ANOVA) for RCBD (Panse & Sukhatme, 1978). Genotypic and phenotypic coefficients of variability (GCV, PCV) were calculated following the method proposed by Burton & De-Vane (1953); broad sense heritability ($H_{bs}$) was computed following the method suggested by Allard (1960); correlation coefficients by Fisher (1954); genetic advance as per the method given by Johnson, Robinson & Comstock (1955); and path coefficient analysis by Dewey & Lu (1959).
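As an illustration of the Levene statistic described above, the following sketch computes W from absolute deviations about each subgroup's centre. It is not the SPSS procedure used in the study; the function name and the toy data are assumptions, and the p-value (obtained from an F distribution with k − 1 and N − k degrees of freedom) is left out to keep the example dependency-free.

```python
from statistics import mean, median


def levene_w(groups, center="median"):
    """Levene's W for a list of subgroups (each a list of numbers).

    center="mean" gives the original Levene form; center="median"
    gives the median-based variant. Both use Z_ij = |Y_ij - centre_i|.
    """
    centre_fn = median if center == "median" else mean
    k = len(groups)
    n_total = sum(len(g) for g in groups)

    # Absolute deviations of each observation from its subgroup centre.
    z = [[abs(y - centre_fn(g)) for y in g] for g in groups]
    z_group_means = [mean(zi) for zi in z]
    z_grand_mean = sum(sum(zi) for zi in z) / n_total

    between = sum(len(zi) * (zm - z_grand_mean) ** 2
                  for zi, zm in zip(z, z_group_means))
    within = sum((v - zm) ** 2
                 for zi, zm in zip(z, z_group_means) for v in zi)
    # Note: 'within' is zero only if every deviation in each group is identical.
    return (n_total - k) / (k - 1) * between / within


if __name__ == "__main__":
    # Toy values for one trait in three environments (illustrative only).
    env_2017 = [41.0, 43.5, 39.8, 44.1]
    env_2018 = [40.2, 42.9, 41.1, 45.0]
    env_2019 = [39.5, 44.4, 40.7, 43.2]
    print(levene_w([env_2017, env_2018, env_2019]))
```

A small W (relative to the F critical value) is consistent with homogeneous variances, which is the condition the study checked before pooling the three years.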

RESULTS
The results of Levene's test of homogeneity of variance (1960) for the three environments are presented in Table 1. The traits did not show any significant deviation (P ≤ 0.005) across the environments, and hence the homogeneity of variance assumption is reasonably met for one-way ANOVA. ANOVA (Table 2) performed for all the traits showed highly significant differences among the studied genotypes at P ≤ 0.005.
The genotypic path analysis for fruit yield per plant is presented in Table 4. Number of fruits per plant (0.880) showed the maximum positive direct effect on fruit yield per plant, followed by fruit length (0.274) and fruit girth (0.200). Direct positive effects of the other traits were negligible, and the residual effect was low (0.298), which indicates that the traits taken for the study have 70% accuracy for yield determination in ghost pepper. Path analysis for capsaicin content (Table 5) showed that the highest positive direct effect on the trait was exerted by fruit yield per plant (2.375), followed by days to 50% flowering (0.273), days to maturity (0.202) and plant height (0.115) (Table 4). Leaf length showed a very weak direct association (0.033), and the rest of the traits showed direct negative effects. Number of fruits per plant showed the maximum negative direct effect (−1.802) on capsaicin content, indicating that these two characters cannot be improved simultaneously. Indirect contribution to capsaicin content was shown by days to 50% flowering via fruit yield per plant (1.092) and days to maturity (0.118). Fruit length also showed an indirect contribution towards capsaicin content via plant height (0.108). Similarly, fruit yield per plant showed

The dendrogram constructed for the genetic divergence study using Mahalanobis' D² analysis is shown in Fig. 1. Based on the degree of divergence, the genotypes were grouped into 20 clusters. Cluster I contained the highest number of genotypes (60), followed by cluster II (16) and cluster III (9), while nine genotypes came out as distinct entities. For most of the genotypes, the grouping is in accordance with their morphological characteristics, viz., RRL-BJ-102 and 18 (brown variants), RRL-BJ-20 and 25 (yellow variants), and RRL-BJ-92 and 58 (round red variants) (Fig. S1).
According to the Mahalanobis distance matrix, average intra-cluster divergence ranged from 0 to 112.11 (Table 6), with the maximum distance observed within cluster I (60 genotypes) and the minimum for clusters V, XII, XIV, VIII, XV, XVIII, XVII, XIX and XX (0.00). This indicates the presence of wide genetic diversity among the genotypes of cluster I, while the genotypes in solitary clusters can serve as potent parents owing to their divergent traits, which separate them from the genotypes in other clusters. Inter-cluster divergence ranged from 11.48 to 691.79, with the highest distance (691.79) seen between clusters I and XIX, followed by clusters XV and XIX (689.64), clusters VII and XIX (677.99) and clusters XVI and XIX (663.10). The minimum inter-cluster distance (11.48) was observed between clusters III and XIII.
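The intra- and inter-cluster distances reported above can be illustrated with a short sketch. It assumes that D² between two genotypes is computed from their trait mean vectors and a pooled error covariance matrix, and that cluster distances are plain averages of pairwise D² values; the software used in the study may apply a different convention (for example, reporting the square root of the average), so the function names and the averaging rule here are assumptions.

```python
import numpy as np


def mahalanobis_d2(means, pooled_cov):
    """Pairwise D^2 between genotype mean vectors (rows of `means`)."""
    means = np.asarray(means, dtype=float)
    inv_cov = np.linalg.inv(np.asarray(pooled_cov, dtype=float))
    diff = means[:, None, :] - means[None, :, :]
    # d2[i, j] = diff_ij^T * inv_cov * diff_ij
    return np.einsum("ijk,kl,ijl->ij", diff, inv_cov, diff)


def cluster_distance(d2, labels, a, b):
    """Average D^2 within a cluster (a == b) or between two clusters."""
    labels = np.asarray(labels)
    ia = np.flatnonzero(labels == a)
    ib = np.flatnonzero(labels == b)
    block = d2[np.ix_(ia, ib)]
    if a == b:
        n = len(ia)
        if n < 2:
            return 0.0  # a solitary cluster has no within-cluster pairs
        return float(block[np.triu_indices(n, k=1)].mean())
    return float(block.mean())


if __name__ == "__main__":
    # Three hypothetical genotypes described by two traits.
    means = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
    d2 = mahalanobis_d2(means, np.eye(2))
    labels = ["I", "I", "II"]
    print(cluster_distance(d2, labels, "I", "I"))
    print(cluster_distance(d2, labels, "I", "II"))
    print(cluster_distance(d2, labels, "II", "II"))
```

Under this convention a single-genotype cluster has an intra-cluster distance of zero, which agrees with the 0.00 values reported for the solitary clusters.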
The percentage contribution of each trait towards total divergence is presented in Table 7. Fruit yield per plant showed the highest contribution towards genetic divergence (17.498%), followed by fruit length (11.390%), plant height (11.836%), days to 50% flowering (10.921%), fruit girth (9.297%), leaf length (8.672%), capsaicin content (8.554%) and leaf breadth (6.705%), while number of main branch and days to maturity showed the minimum contribution towards total divergence. Principal component analysis (PCA) was performed to determine the largest contributors of variation along the different axes of differentiation. PCA revealed that about 73.48% of the variability is explained by the four principal components (F1, F2, F3 and F4) that have an eigenvalue greater than 1. The first component accounted for 34.59% of the total variability, contributed by traits with high positive values, viz., fruit yield per plant, followed by plant height, fruit length, days to 50% flowering, fruit girth, leaf length, capsaicin content, number of fruits per plant and leaf breadth (Fig. 2). The variability in this component is mostly associated with fruit characteristics. About 16.81% of the variability is accounted for by the second component, with strong positive contributions from leaf length, leaf breadth, fruit girth and fruit length. The third and fourth components accounted for 12.13% and 9.96% of the total variability, with strong contribution from the traits number of fruits per plant and fruit yield per plant for 3rd component and leaf breadth, and the traits days to maturity, leaf length and days to 50% flowering for 4th component, respectively. The biplot of the axes (PC1 and PC2), comprising about 51.39% of the variability, showed that high PC1 and low PC2 are required for selecting genotypes with high fruit yield and capsaicin content.
PC1 showed positive correlation with all the studied traits, while PC2 showed positive correlation with leaf length, leaf breadth, fruit length, fruit girth and plant height.

DISCUSSION
Levene's test (Levene, 1960) is a hypothesis test performed before ANOVA to determine whether the studied environments have the same or different variances. If the studied environments have distinct variances, further analysis will be ineffective, since this will result in large discrepancies across the environments. In the present study, the homogeneity of variance assumption tested by Levene's test was reasonably met, allowing ANOVA to be performed. ANOVA of the data pooled over three years showed significant differences among the studied genotypes. This indicates that the genotypes vary significantly for the studied traits, which will be helpful in the selection of various traits for crop improvement. The presence of genetic variability is the basis of all improvement programmes. High variation for a trait provides greater scope for improvement through pure line selection (Nahak et al., 2018). In the present study, high variability parameters (i.e., GCV, PCV) were obtained for fruit yield per plant, number of fruits per plant, capsaicin content, leaf breadth, fruit length and fruit girth. The present result is supported by the previous work of Ibrahim, Ganiger & Yenjerappa (2001), Manju & Sreelathakumary (2002), Nagaraju et al. (2018) and Negi & Sharma (2019). GCV and PCV are estimated to determine the environmental effect on various traits. In the study, it was observed that, except for days to 50% flowering and days to maturity, all the traits showed lower GCV estimates than PCV, indicating considerable environmental influence on the expression of these traits, so selection should be made carefully, taking environmental changes into account, as suggested by Lal, Gupta & Dubey (2017).
Singh & Kumar (2005) suggested that, for a more efficient selection process, heritability should be studied along with variability. According to Singh (2001), heritability of a trait is high when it is 80% or above, moderate when it ranges between 40% and 80%, and low when it is less than 40%. Based on these criteria, seven traits showed high heritability, four traits showed moderate heritability, while low heritability was observed only for the trait number of main branch. Genetic advance is the improvement over the base population that can be achieved by selecting for a trait. Lal et al. (2013) suggested that high heritability along with high genetic advance is most preferred, as such traits are controlled by additive gene action. Accordingly, five traits, viz., fruit yield per plant, number of fruits per plant, capsaicin content, fruit length and fruit girth, showed high heritability with high genetic advance, and hence selection in early generations will be fruitful for these traits in ghost pepper. The present results are in agreement with the findings of Sharma, Semwal & Uniyal (2010), Meena et al. (2016) and Nagaraju et al. (2018) in C. annuum. High heritability with high genetic advance was also observed for fruit length and girth in ghost pepper (Mena et al., 2019). Moderate heritability with low genetic advance, observed for days to maturity (4.73%), indicated the occurrence of non-additive gene action, which makes improvement through direct selection difficult. In such cases, improvement can be achieved using other breeding methods such as mutation and hybridization.
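The variability parameters discussed above can be sketched from the mean squares of an RCBD ANOVA. The example assumes the usual single-environment variance-component estimates (genotypic variance = (MS genotype − MS error) / r, phenotypic variance = genotypic variance + MS error) and a selection differential of k = 2.06 at 5% selection intensity; a pooled multi-year analysis would add genotype × environment terms that are not modelled here. All function names are illustrative, and the heritability thresholds are those of Singh (2001) quoted in the text.

```python
import math


def variability_parameters(ms_genotype, ms_error, replications, grand_mean, k=2.06):
    """GCV, PCV, broad-sense heritability and genetic advance from RCBD mean squares."""
    var_g = max((ms_genotype - ms_error) / replications, 0.0)  # genotypic variance
    var_p = var_g + ms_error                                   # phenotypic variance
    heritability = var_g / var_p
    genetic_advance = k * heritability * math.sqrt(var_p)
    return {
        "gcv_pct": math.sqrt(var_g) / grand_mean * 100,
        "pcv_pct": math.sqrt(var_p) / grand_mean * 100,
        "heritability_pct": heritability * 100,
        "genetic_advance": genetic_advance,
        "ga_pct_of_mean": genetic_advance / grand_mean * 100,
    }


def classify_heritability(heritability_pct):
    """Classes quoted in the text: high >= 80%, moderate 40-80%, low < 40%."""
    if heritability_pct >= 80:
        return "high"
    if heritability_pct >= 40:
        return "moderate"
    return "low"


def heritability_from_cv(gcv_pct, pcv_pct):
    """Heritability (%) implied by GCV and PCV computed on the same trait mean."""
    return (gcv_pct / pcv_pct) ** 2 * 100


if __name__ == "__main__":
    # Hypothetical mean squares, not taken from the study's ANOVA table.
    print(variability_parameters(280.0, 10.0, 3, 50.0))
    # GCV and PCV reported in the abstract for fruit yield per plant.
    implied = heritability_from_cv(33.702, 36.200)
    print(round(implied, 2), classify_heritability(implied))
```

Because GCV and PCV share the same mean, their squared ratio gives the implied broad-sense heritability; for the reported fruit yield values this falls in the "high" class, in line with the text.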
The correlation study showed that fruit yield per plant had a positive and highly significant correlation with nine traits, which is consistent with the reports of Mena et al. (2018), Devi et al. (2018) and Ozukum et al. (2019) in ghost pepper. The correlation study for capsaicin content was also consistent with the reports of Datta & Jana (2010) and Vaishnavi, Khanm & Bhoomika (2017) in different Capsicum species. The path analysis for fruit yield per plant was carried out at the genotypic level because results at the phenotypic level may not adequately reflect the direct and indirect relationships of the component traits (Negi & Sharma, 2019). A direct positive effect of number of fruits on fruit yield of ghost pepper was previously reported by Devi et al. (2018). Further, our results were similar to the findings of Sharma, Semwal & Uniyal (2010), Bijalwan & Mishra (2016) and Vaishnavi, Khanm & Bhoomika (2017) in different Capsicum species. Therefore, number of fruits per plant should be taken as an important selection criterion for the ghost pepper improvement programme. The path study for capsaicin content showed a direct effect from days to 50% flowering, which was similar to the reports of Mini & Vahab (2002) and Misra et al. (2010) in C. annuum.
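The partition of correlations into direct and indirect effects can be sketched as follows. In the Dewey & Lu formulation, the direct effects p solve R p = r, where R is the correlation matrix among the component traits and r holds their correlations with the dependent trait; the indirect effect of trait i via trait j is r_ij × p_j, and the residual is sqrt(1 − Σ p_i r_i). The function name and the two-trait numbers below are assumptions for illustration and are unrelated to Tables 4 and 5.

```python
import numpy as np


def path_analysis(r_xx, r_xy):
    """Direct effects, effect matrix and residual for a path analysis.

    r_xx: correlation matrix among the component (causal) traits.
    r_xy: correlations of each component trait with the dependent trait.
    """
    r_xx = np.asarray(r_xx, dtype=float)
    r_xy = np.asarray(r_xy, dtype=float)

    direct = np.linalg.solve(r_xx, r_xy)
    # effects[i, j] = r_ij * p_j: diagonal entries are direct effects,
    # off-diagonal entries are indirect effects of trait i via trait j.
    effects = r_xx * direct[np.newaxis, :]
    explained = float(direct @ r_xy)
    residual = float(np.sqrt(max(1.0 - explained, 0.0)))
    return direct, effects, residual


if __name__ == "__main__":
    r_xx = [[1.0, 0.5],
            [0.5, 1.0]]
    r_xy = [0.8, 0.6]
    direct, effects, residual = path_analysis(r_xx, r_xy)
    print(direct)
    print(effects.sum(axis=1))  # each row sums back to the trait's correlation
    print(residual)
```

Each row of the effect matrix sums to the original correlation, which is a convenient check that the decomposition has been carried out consistently.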
Mahalanobis' D² statistic (Mahalanobis, 1936), an ideal tool for determining genetic divergence, was used in the present study, where many clusters were formed irrespective of the place of collection of the genotypes. This formation of many individual clusters based on morphological characterization indicates the presence of sufficient genetic variability among them. In C. annuum, Ajjappalavara (2009), Kumari et al. (2010) and Datta & Jana (2011) also observed the formation of a large number of clusters, indicating the presence of wide variability in the studied material. According to Patel et al. (2017), genotypes from highly divergent clusters should be used for the development of high-yielding varieties. Based on this criterion, genotypes in clusters I and XIX would be the most fruitful for producing segregants with great heterotic effects, as the distance between them was the maximum. Further, the inter-cluster range observed in the present study was much higher than that obtained by Yatung et al. (2014) in C. annuum.
The study of the contribution of different traits towards genetic divergence is important for the selection and choice of parents for a hybridization programme and is performed on the basis of D² values. Fruit yield per plant can be taken into consideration during the selection of parents for hybridization, as it showed the highest contribution towards genetic divergence. It was followed by fruit length, plant height, days to 50% flowering, fruit girth, leaf length, capsaicin content and leaf breadth. Maximum contribution by similar traits towards total divergence was also reported earlier by Hasan et al. (2015) in C. annuum. PCA was done to determine the largest contributors of variation along the different axes of differentiation (Sharma, 1988). According to Pandey & Bhatore (2018), the higher the percentage contribution of traits towards divergence, the more effective they will be in recovering transgressive segregants using multiple crosses. Afuape, Okocha & Njoku (2011) suggested that, for effective breeding, only the components having an eigenvalue greater than one should be used for determining traits that can produce phenotypic differences. The PCA results in the present study showed that the majority of the variability was explained by the first four principal components, which was similar to the reports of Sarmah, Sarma & Gogoi (2018) for different chili varieties of Assam. In a recent study on C. annuum, the percentage contributions of the first two principal components were 28% and 19%, respectively (Karim et al., 2022). In this respect, the current study showed a high percentage contribution from the first two components, together accounting for 51.39%. A study by Azeez & Morakinyo (2011) showed that, for selecting genotypes with high fruit yield and capsaicin content, high PC1 components should be taken into consideration, which implies that the traits contributing more to the first principal component should be given preference.
Accordingly, the traits fruit yield per plant, plant height, fruit length, days to 50% flowering, fruit girth, leaf length, capsaicin content and number of fruits per plant showed a positive contribution to the first principal component. Based on the three-year detailed study, the genotypes RRL-120, 118, 70, 60, 66, 5, 18, 88, 65 and 106 are considered superior for capsaicin content and fruit yield and can be opted for large-scale evaluation.
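The eigenvalue criterion and the percentage of variability per component referred to above can be illustrated with a short sketch. It assumes that PCA is run on the correlation matrix of the trait data (genotypes in rows, traits in columns) and that components with an eigenvalue greater than one are retained; the software and settings used in the study are not specified here, and the function name and simulated data are illustrative only.

```python
import numpy as np


def pca_on_correlation(data):
    """PCA of a genotypes x traits matrix via its correlation matrix.

    Returns eigenvalues (descending), loadings (eigenvectors as columns),
    percentage of variance explained, and a mask of components with
    eigenvalue > 1.
    """
    data = np.asarray(data, dtype=float)
    corr = np.corrcoef(data, rowvar=False)

    eigvals, eigvecs = np.linalg.eigh(corr)   # ascending order for symmetric input
    order = np.argsort(eigvals)[::-1]         # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    explained_pct = eigvals / eigvals.sum() * 100
    retained = eigvals > 1.0
    return eigvals, eigvecs, explained_pct, retained


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Simulated data: 120 genotypes x 11 traits (random, for illustration only).
    traits = rng.normal(size=(120, 11))
    eigvals, loadings, explained_pct, retained = pca_on_correlation(traits)
    print(np.round(explained_pct, 2))
    print(int(retained.sum()), "components with eigenvalue > 1")
    print(np.round(explained_pct[retained].sum(), 2), "% explained by retained components")
```

Summing the percentages of the retained components gives the cumulative figure of the kind reported in the text (about 73.48% for four components in the study).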

CONCLUSIONS
The two most economically significant characteristics of the Capsicum chinense Jacq. crop are pungency and yield. Recent years have witnessed a significant decline in the popularity of this important crop due to the use of inferior planting material and the lack of elite lines. It is therefore necessary to develop suitable elite lines with promising characteristics to meet the quality requirements. The current study aimed to identify the essential factors for selecting superior ghost pepper lines. Priority should be given to traits such as high fruit yield per plant, fruit length, fruit girth, days to 50% flowering, days to maturity and plant height, as these will either directly or indirectly influence varietal selection when developing a breeding programme for ghost pepper. All these traits showed positive correlation, high heritability and genetic advance, which are considered important criteria for the selection of elite lines. Based on the percent contribution towards total divergence and the PCA data, eleven genotypes (120, 118, 70, 60, 66, 5, 18, 88, 65 and 106) with high yield and capsaicin content were selected. The lines can be opted for large-scale cultivation before going for variety recommendation and will be distributed for farmers' trials. To the best of our knowledge, this is the first detailed study on variability estimation and genetic divergence with a three-year evaluation that incorporates a large number of genotypes.

ANOVA
Analysis of variance